Installed the fasta_classify_catpack subworkflow #38

Merged
KateSakharova merged 4 commits into dev from feature/catpack-for-metagenomic-taxa-classification
Mar 16, 2026

@KateSakharova (Contributor) commented Mar 13, 2026

Taxonomy part

  • Added the fasta_classify_catpack subworkflow from nf-core
  • Changed the extension of all FASTA files without a taxonomy lineage to a single .fasta extension, which CAT_pack requires for its input

@nf-core-bot (Member)

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@github-actions bot commented Mar 13, 2026

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 3cd524a

+| ✅ 204 tests passed       |+
#| ❔   7 tests were ignored |#
#| ❔   1 tests had warnings |#
!| ❗  24 tests had warnings |!
Details

❗ Test warnings:

  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in README.md: Add bibliography of tools and data used in your pipeline
  • pipeline_todos - TODO string in nextflow.config: Optionally, you can add a pipeline-specific nf-core config at https://github.com/nf-core/configs
  • pipeline_todos - TODO string in nextflow.config: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in test_assembly.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test_assembly.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_genome.config: Specify the paths to your test data on nf-core/test-datasets
  • pipeline_todos - TODO string in test_genome.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in environment.yml: List required Conda package(s).
  • pipeline_todos - TODO string in main.nf.test: Once you have added the required tests, please run the following command to build this file:
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • local_component_structure - rna_detection.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure
  • local_component_structure - genome_evaluation.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure

❔ Tests ignored:

  • files_exist - File is ignored: conf/igenomes.config
  • files_exist - File is ignored: conf/igenomes_ignored.config
  • nextflow_config - Config variable ignored: params.input
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-seqsubmit_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-seqsubmit_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-seqsubmit_logo_dark.png

❔ Tests fixed:

✅ Tests passed:

Run details

  • nf-core/tools version 3.5.1
  • Run at 2026-03-16 12:23:43

// Database preparation
//

// Handle pre-built db: untar if compressed, or use directory directly
Contributor Author

@mberacochea that part was not working when the input is a directory. A directory doesn't have .name in Nextflow. I fixed it by checking the instance type. Maybe we should push that fix to nf-core 😱

Collaborator

OK, I'll fix it there. Can you create the patch please?

Contributor Author

Later, I'm still testing.
One more note: we need to check the structure of the input DB directory. You are expecting db and tax subfolders inside it. That is not obvious, so it should be validated and documented in the subworkflow.
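
One way to make the expected layout explicit is a small check that runs before the database is used. This is only a sketch: the helper name missingCatDbSubdirs is hypothetical and not part of the nf-core subworkflow, and the db and tax subfolder names are taken from the comment above.

```groovy
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper (not part of nf-core): returns the names of the
// expected subfolders that are absent from a pre-built CAT_pack DB directory.
List<String> missingCatDbSubdirs(Path dbDir) {
    // 'db' and 'tax' are the subfolders mentioned in the review comment
    return ['db', 'tax'].findAll { name -> !Files.isDirectory(dbDir.resolve(name)) }
}

// Example use inside a workflow (assumption: `db_dir` is a Path to the DB):
//   def missing = missingCatDbSubdirs(db_dir)
//   if (missing) {
//       error("CAT_pack DB directory ${db_dir} is missing: ${missing.join(', ')}")
//   }
```

Returning the list of missing names, instead of failing inside the helper, leaves the caller free to decide whether to stop the run or only log a warning.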

@KateSakharova KateSakharova force-pushed the feature/catpack-for-metagenomic-taxa-classification branch from b0b3b81 to 1ffbcb7 Compare March 13, 2026 14:37
@KateSakharova KateSakharova marked this pull request as ready for review March 16, 2026 11:05
@KateSakharova KateSakharova requested a review from ochkalova March 16, 2026 11:07
Comment on lines +21 to +29
withName: 'CATPACK_ADDNAMES_BINS' {
    ext.args = '--only_official'
    publishDir = [
        path: { "${params.outdir}/${params.mode}/taxonomy" },
        mode: params.publish_dir_mode,
        pattern: "*bin2classification*.txt",
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
    ]
}
Contributor

This publishDir won't work because the file is currently called "${meta.id}.txt", which does not match the pattern.
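
One possible fix is to set a prefix that the existing pattern matches. The sketch below assumes that the module writes its output as ${prefix}.txt, with prefix read from task.ext.prefix (the usual nf-core convention, not confirmed in this thread). The prefix value itself is made up for illustration.

```groovy
// Sketch only: assumes the module names its output "${prefix}.txt",
// where prefix comes from task.ext.prefix and falls back to meta.id.
process {
    withName: 'CATPACK_ADDNAMES_BINS' {
        ext.args   = '--only_official'
        // Hypothetical prefix so the output name contains "bin2classification"
        ext.prefix = { "${meta.id}.bin2classification.names" }
        publishDir = [
            path: { "${params.outdir}/${params.mode}/taxonomy" },
            mode: params.publish_dir_mode,
            pattern: "*bin2classification*.txt",
            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
        ]
    }
}
```

The alternative is to leave the file name alone and loosen the pattern instead, at the cost of publishing any other .txt file the process emits.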

Comment on lines +31 to +39
withName: 'CATPACK_SUMMARISE_BINS' {
    ext.prefix = "bins_summary"
    publishDir = [
        path: { "${params.outdir}/${params.mode}/taxonomy" },
        mode: params.publish_dir_mode,
        pattern: "*_summary.txt",
        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
    ]
}
Contributor

Do you think we need to run this module? I think it's unnecessary...

Contributor Author

I was not sure about it... I can make it 'false'

"""
input[0] = [
[ id: 'test' ],
file("${projectDir}/modules/local/rename_fasta_for_catpack/tests/test.fasta", checkIfExists: true)
Contributor

Suggested change
- file("${projectDir}/modules/local/rename_fasta_for_catpack/tests/test.fasta", checkIfExists: true)
+ file("${moduleDir}/tests/test.fasta", checkIfExists: true)

Comment on lines +10 to +28
script:
def is_compressed = fasta.name.endsWith('.gz')
extension = '.fasta'
def base = fasta.name
    .replaceAll(/\.gz$/, '')
    .replaceAll(/\.(fa|fasta|fna)$/, '')
def output_name = base + extension

if (is_compressed) {
    """
    mkdir -p output
    gunzip -c ${fasta} > output/${output_name}
    """
} else {
    """
    mkdir -p output
    ln -s ../${fasta} output/${output_name}
    """
}
Contributor

wow, nextflow magic

tuple val(meta), path(fasta)

output:
tuple val(meta), path("output/*.fasta{,.gz}"), emit: renamed_fasta
Contributor

You decompress every .gz input, so the output is always just .fasta
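
A sketch of the narrowed output declaration follows. The process name is assumed from the module directory (rename_fasta_for_catpack), and the script body is condensed from the diff above. Treat it as illustrative, not as the merged code.

```groovy
// Sketch only: process name assumed from the module path.
process RENAME_FASTA_FOR_CATPACK {
    input:
    tuple val(meta), path(fasta)

    output:
    // Compressed inputs are always decompressed, so only .fasta can appear here
    tuple val(meta), path("output/*.fasta"), emit: renamed_fasta

    script:
    // Strip an optional .gz and a known FASTA extension, then append .fasta
    def base = fasta.name
        .replaceAll(/\.gz$/, '')
        .replaceAll(/\.(fa|fasta|fna)$/, '')
    def output_name = "${base}.fasta"
    // Decompress gzipped inputs; symlink everything else under the new name
    def cmd = fasta.name.endsWith('.gz')
        ? "gunzip -c ${fasta} > output/${output_name}"
        : "ln -s ../${fasta} output/${output_name}"
    """
    mkdir -p output
    ${cmd}
    """
}
```

With the glob reduced to output/*.fasta, a stray .fasta.gz in the output folder would no longer be picked up silently.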

Comment on lines +36 to +40
- "output/*.fasta{,.gz}":
    type: file
    description: |
      Renamed FASTA file with a .fasta extension, decompressed if the input was compressed.
    pattern: "output/*.fasta{,.gz}"
Contributor

Same here

nextflow.config Outdated

// NCBI taxonomy
cat_db = null
cat_db_download_id = ''
Contributor

Why '' and not null? Why doesn't it have a default value like checkm2_db_zenodo_id?

Contributor Author

because I didn't search for it... 🤦‍♀️

@KateSakharova KateSakharova force-pushed the feature/catpack-for-metagenomic-taxa-classification branch from 3fb2e0a to 3cd524a Compare March 16, 2026 12:22
@KateSakharova KateSakharova merged commit 2b64c73 into dev Mar 16, 2026
12 of 20 checks passed
@KateSakharova KateSakharova deleted the feature/catpack-for-metagenomic-taxa-classification branch March 16, 2026 12:55

4 participants